Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments
Abstract
This paper describes a multi-template unsupervised speaker adaptation method based on HMM-Sufficient Statistics. Multiple class-dependent models based on gender and age are used to improve adaptation performance while keeping adaptation time within a few seconds, using just one arbitrary utterance. Adaptation begins by estimating the speaker's class from the N-best neighbor speakers, which are found with Gaussian Mixture Models (GMMs) during speaker selection. The corresponding template model is adopted as the base model, and the adapted model is then rapidly constructed from the selected HMM-Sufficient Statistics. Experiments are performed under noisy conditions with office, crowd, booth, and car noise at 20 dB SNR. The proposed multi-template method achieved a word correct rate of 89.5%, compared with 88.0% for the conventional single-template method and a baseline of 85.7% without adaptation. Results with Vocal Tract Length Normalization (VTLN) and supervised Maximum Likelihood Linear Regression (MLLR) are also compared.
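The class-estimation step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each training speaker is represented by a diagonal-covariance GMM, that neighbors are ranked by average frame log-likelihood, and that the class is chosen by majority vote among the N-best speakers (the abstract does not say how the class is derived from them). All function names, speaker IDs, and class labels are hypothetical.

```python
from collections import Counter

import numpy as np


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood of frames (T, D) under a
    diagonal-covariance GMM with M components."""
    diff = frames[:, None, :] - means[None, :, :]                     # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, M)
    log_mix = log_comp + np.log(weights)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_mix.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_mix - peak).sum(axis=1))
    return float(frame_ll.mean())


def select_n_best(frames, speaker_gmms, n_best):
    """Return the IDs of the n_best speakers whose GMMs score the utterance highest."""
    scores = {spk: gmm_avg_loglik(frames, *gmm) for spk, gmm in speaker_gmms.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_best]


def estimate_class(n_best_speakers, speaker_class):
    """Assumed rule: majority vote over the classes of the N-best speakers."""
    votes = Counter(speaker_class[spk] for spk in n_best_speakers)
    return votes.most_common(1)[0][0]


# Toy data: single-component GMMs in two dimensions (illustrative only).
one = np.array([1.0])
unit_var = np.array([[1.0, 1.0]])
speaker_gmms = {
    "f1": (one, np.array([[0.0, 0.0]]), unit_var),
    "f2": (one, np.array([[0.5, 0.5]]), unit_var),
    "m1": (one, np.array([[5.0, 5.0]]), unit_var),
    "m2": (one, np.array([[6.0, 6.0]]), unit_var),
}
speaker_class = {"f1": "female", "f2": "female", "m1": "male", "m2": "male"}

utterance = np.zeros((10, 2))  # stand-in for the feature frames of one utterance
neighbors = select_n_best(utterance, speaker_gmms, n_best=3)
template_class = estimate_class(neighbors, speaker_class)
```

In this sketch the estimated class would then pick which class-dependent template serves as the base model, and the sufficient statistics of the same N-best speakers would be used to adapt it.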
Similar resources
Spectral subtraction in noisy environments applied to speaker adaptation based on HMM sufficient statistics
Noise and speaker adaptation techniques are essential to realize robust speech recognition in real noisy environments. In this paper, we applied spectral subtraction to an unsupervised speaker adaptation algorithm in noisy environments. The adaptation algorithm consists of the following five steps. (1) Spectral subtraction is carried out on a noise-added database. (2) Noise matched acoustic mod...
Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection
In this paper, we employ the concept of HMM-Sufficient Statistics (HMM-Suff Stat) and N-best speaker selection to realize a rapid implementation of Baum-Welch and MLLR. Only a single arbitrary utterance is required, which is used to select the N-best speakers' HMM-Suff Stat from the training database as adaptation data. Since HMM-Suff Stat are pre-computed offline, the computational load is minimized....
Doctoral Dissertation: Rapid Unsupervised Speaker Adaptation Based on Sufficient Statistics of Hidden Markov Models
To realize a speech recognition system that is robust to speaker variation, an efficient adaptation algorithm is needed. Most adaptation techniques require a large amount of adaptation data to carry out an adaptation task. Adaptation data are often collected from the actual speaker over several utterances. With the time needed to gather and transcribe the adaptation utterances, together with the actual e...
Unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers
This paper describes an efficient method for unsupervised speaker adaptation. This method is based on (1) selecting a subset of speakers who are acoustically close to a test speaker, and (2) calculating adapted model parameters according to the previously stored sufficient HMM statistics of the selected speakers’ data. In this method, only a few unsupervised test speaker’s data are required for...
Evaluation on unsupervised speaker adaptation based on sufficient HMM statistics of selected speakers
This paper describes an efficient method of unsupervised speaker adaptation. This method is based on (1) selecting a subset of speakers who are acoustically close to a test speaker, and (2) calculating adapted model parameters according to the previously stored sufficient statistics of the selected speakers’ data. In this method, only a few unsupervised test speaker’s data are necessary for the...
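The second step named in the two abstracts above, calculating adapted parameters from stored sufficient statistics of the selected speakers, can be sketched for a single Gaussian. This is a minimal illustration rather than the method of any listed paper: it assumes each speaker's stored statistics for that Gaussian are an occupancy count, a first-order sum, and a second-order sum, and that the adapted mean and diagonal variance follow from the usual maximum-likelihood re-estimation on the pooled sums. The dictionary keys, function names, and variance floor are hypothetical.

```python
import numpy as np


def pool_statistics(stats_list):
    """Sum the stored statistics of the selected speakers for one Gaussian."""
    occ = sum(s["occ"] for s in stats_list)        # total occupancy (scalar)
    first = sum(s["first"] for s in stats_list)    # sum of gamma * x, shape (D,)
    second = sum(s["second"] for s in stats_list)  # sum of gamma * x**2, shape (D,)
    return occ, first, second


def reestimate_gaussian(stats_list, var_floor=1e-4):
    """Maximum-likelihood mean and diagonal variance from pooled statistics."""
    occ, first, second = pool_statistics(stats_list)
    mean = first / occ
    # Floor the variance so sparse statistics cannot yield a degenerate Gaussian.
    var = np.maximum(second / occ - mean ** 2, var_floor)
    return mean, var


# Toy statistics from two selected speakers (illustrative values only).
selected = [
    {"occ": 2.0, "first": np.array([2.0, 4.0]), "second": np.array([4.0, 10.0])},
    {"occ": 2.0, "first": np.array([6.0, 4.0]), "second": np.array([20.0, 10.0])},
]
adapted_mean, adapted_var = reestimate_gaussian(selected)
```

Because the per-speaker sums are stored in advance, adaptation under this sketch reduces to additions and one division per Gaussian, which is consistent with the abstracts' emphasis on speed.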
Publication date: 2005